Dataset statistics
| Number of variables | 20 |
|---|---|
| Number of observations | 8145 |
| Missing cells | 0 |
| Missing cells (%) | 0.0% |
| Duplicate rows | 4549 |
| Duplicate rows (%) | 55.9% |
| Total size in memory | 1.2 MiB |
| Average record size in memory | 160.0 B |
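The memory figures above are consistent with simple arithmetic if each of the 20 columns is stored as a 64-bit integer. The report does not state the column dtypes, so the 8-byte width below is an assumption, and the sketch ignores index overhead.

```python
# Assumption: every column is stored as a 64-bit integer (8 bytes per value).
N_ROWS = 8145
N_COLS = 20
BYTES_PER_VALUE = 8  # assumed int64 storage, not stated in the report

record_bytes = N_COLS * BYTES_PER_VALUE          # bytes per row
total_mib = N_ROWS * record_bytes / 1024 ** 2    # whole table, in MiB
column_kib = N_ROWS * BYTES_PER_VALUE / 1024     # one column, in KiB

print(f"record size: {record_bytes} B")
print(f"total size: {total_mib:.1f} MiB")
print(f"one column: {column_kib:.1f} KiB (excluding any index overhead)")
```

The per-variable memory sizes listed later in the report are slightly larger than the single-column figure computed here, which would fit a small fixed overhead per column.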
Variable types
| NUM | 9 |
|---|---|
| BOOL | 6 |
| CAT | 5 |
Reproduction
| Analysis started | 2020-08-25 01:03:55.435801 |
|---|---|
| Analysis finished | 2020-08-25 01:04:09.740014 |
| Duration | 14.3 seconds |
| Version | pandas-profiling v2.8.0 |
| Command line | pandas_profiling --config_file config.yaml [YOUR_FILE.csv] |
| Download configuration | config.yaml |
Warnings
| veil-type has constant value "0" | Constant |
|---|---|
| Dataset has 4549 (55.9%) duplicate rows | Duplicates |
| cap-shape has 454 (5.6%) zeros | Zeros |
| stalk-color-above-ring has 432 (5.3%) zeros | Zeros |
| gill-color has 1728 (21.2%) zeros | Zeros |
| population has 385 (4.7%) zeros | Zeros |
| odor has 407 (5.0%) zeros | Zeros |
| ring-type has 2780 (34.1%) zeros | Zeros |
| cap-color has 168 (2.1%) zeros | Zeros |
| habitat has 3151 (38.7%) zeros | Zeros |
| stalk-root has 2480 (30.4%) zeros | Zeros |
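The three kinds of warning listed above (constant columns, zero counts, duplicate rows) can be derived with a few lines of plain Python. This is a minimal sketch rather than the profiler's own implementation: the helper name `summarize_warnings` and the toy rows are hypothetical, and the duplicate definition (a row that repeats an earlier row) is an assumption.

```python
from collections import Counter

def summarize_warnings(rows):
    """Return (constant_columns, zero_counts, duplicate_row_count) for a list of dict rows."""
    columns = list(rows[0].keys())
    # A column is constant when it holds exactly one distinct value.
    constant = [c for c in columns if len({r[c] for r in rows}) == 1]
    # Number of zero entries per column.
    zeros = {c: sum(1 for r in rows if r[c] == 0) for c in columns}
    # Assumption: a row counts as a duplicate when an identical row appeared earlier.
    seen = Counter(tuple(r[c] for c in columns) for r in rows)
    duplicates = sum(n - 1 for n in seen.values())
    return constant, zeros, duplicates

# Toy data, not taken from the report.
rows = [
    {"veil-type": 0, "cap-shape": 5, "target": 1},
    {"veil-type": 0, "cap-shape": 0, "target": 0},
    {"veil-type": 0, "cap-shape": 5, "target": 1},
    {"veil-type": 0, "cap-shape": 2, "target": 0},
]
constant, zeros, duplicates = summarize_warnings(rows)
print(constant, zeros, duplicates)
```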
cap-shape
Numeric
| Distinct count | 6 |
|---|---|
| Unique (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 3.3484346224677717 |
|---|---|
| Minimum | 0 |
| Maximum | 5 |
| Zeros | 454 |
| Zeros (%) | 5.6% |
| Memory size | 63.8 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 2 |
| median | 3 |
| Q3 | 5 |
| 95-th percentile | 5 |
| Maximum | 5 |
| Range | 5 |
| Interquartile range (IQR) | 3 |
Descriptive statistics
| Standard deviation | 1.604770276 |
|---|---|
| Coefficient of variation (CV) | 0.4792598505 |
| Kurtosis | -1.242066847 |
| Mean | 3.348434622 |
| Median Absolute Deviation (MAD) | 2 |
| Skewness | -0.2480877888 |
| Sum | 27273 |
| Variance | 2.57528764 |
Histogram with fixed size bins (bins=10)
Common values
| Value | Count | Frequency (%) | |
| 5 | 3667 | 45.0% | |
| 2 | 3159 | 38.8% | |
| 3 | 828 | 10.2% | |
| 0 | 454 | 5.6% | |
| 4 | 33 | 0.4% | |
| 1 | 4 | < 0.1% |
Minimum values
| Value | Count | Frequency (%) | |
| 0 | 454 | 5.6% | |
| 1 | 4 | < 0.1% | |
| 2 | 3159 | 38.8% | |
| 3 | 828 | 10.2% | |
| 4 | 33 | 0.4% | |
| 5 | 3667 | 45.0% |
Maximum values
| Value | Count | Frequency (%) | |
| 5 | 3667 | 45.0% | |
| 4 | 33 | 0.4% | |
| 3 | 828 | 10.2% | |
| 2 | 3159 | 38.8% | |
| 1 | 4 | < 0.1% | |
| 0 | 454 | 5.6% |
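The descriptive statistics reported for cap-shape can be recomputed from its frequency table alone. The sketch below assumes the sample (n - 1) variance and the bias-corrected skewness and excess-kurtosis estimators; the report does not name its estimators, so those choices are assumptions, and the helper `describe` is hypothetical.

```python
import math

# Frequency table for cap-shape, copied from the report (value -> count).
freq = {0: 454, 1: 4, 2: 3159, 3: 828, 4: 33, 5: 3667}

def describe(freq):
    """Sample statistics from a value -> count table."""
    n = sum(freq.values())
    total = sum(v * c for v, c in freq.items())
    mean = total / n
    # Sums of powers of deviations from the mean.
    m2 = sum(c * (v - mean) ** 2 for v, c in freq.items())
    m3 = sum(c * (v - mean) ** 3 for v, c in freq.items())
    m4 = sum(c * (v - mean) ** 4 for v, c in freq.items())
    variance = m2 / (n - 1)          # assumed: sample variance (ddof = 1)
    std = math.sqrt(variance)
    # Assumed estimators: adjusted Fisher-Pearson skewness, bias-corrected excess kurtosis.
    skew = n / ((n - 1) * (n - 2)) * m3 / std ** 3
    kurt = (
        n * (n + 1) * (n - 1) * m4 / ((n - 2) * (n - 3) * m2 ** 2)
        - 3 * (n - 1) ** 2 / ((n - 2) * (n - 3))
    )
    return {
        "n": n, "sum": total, "mean": mean, "variance": variance,
        "std": std, "cv": std / mean, "skewness": skew, "kurtosis": kurt,
    }

stats = describe(freq)
for name, value in stats.items():
    print(f"{name}: {value}")
```

Under these assumptions the computed values agree with the Sum, Mean, Variance, Standard deviation, CV, Skewness and Kurtosis rows listed for this variable.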
stalk-color-above-ring
Numeric
| Distinct count | 9 |
|---|---|
| Unique (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 5.8193984039287905 |
|---|---|
| Minimum | 0 |
| Maximum | 8 |
| Zeros | 432 |
| Zeros (%) | 5.3% |
| Memory size | 63.8 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 6 |
| median | 7 |
| Q3 | 7 |
| 95-th percentile | 7 |
| Maximum | 8 |
| Range | 8 |
| Interquartile range (IQR) | 1 |
Descriptive statistics
| Standard deviation | 1.900242173 |
|---|---|
| Coefficient of variation (CV) | 0.3265358446 |
| Kurtosis | 2.517571226 |
| Mean | 5.819398404 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | -1.839249759 |
| Sum | 47399 |
| Variance | 3.610920316 |
Histogram with fixed size bins (bins=10)
Common values
| Value | Count | Frequency (%) | |
| 7 | 4485 | 55.1% | |
| 6 | 1872 | 23.0% | |
| 3 | 576 | 7.1% | |
| 4 | 448 | 5.5% | |
| 0 | 432 | 5.3% | |
| 5 | 192 | 2.4% | |
| 2 | 96 | 1.2% | |
| 1 | 36 | 0.4% | |
| 8 | 8 | 0.1% |
Minimum values
| Value | Count | Frequency (%) | |
| 0 | 432 | 5.3% | |
| 1 | 36 | 0.4% | |
| 2 | 96 | 1.2% | |
| 3 | 576 | 7.1% | |
| 4 | 448 | 5.5% | |
| 5 | 192 | 2.4% | |
| 6 | 1872 | 23.0% | |
| 7 | 4485 | 55.1% | |
| 8 | 8 | 0.1% |
Maximum values
| Value | Count | Frequency (%) | |
| 8 | 8 | 0.1% | |
| 7 | 4485 | 55.1% | |
| 6 | 1872 | 23.0% | |
| 5 | 192 | 2.4% | |
| 4 | 448 | 5.5% | |
| 3 | 576 | 7.1% | |
| 2 | 96 | 1.2% | |
| 1 | 36 | 0.4% | |
| 0 | 432 | 5.3% |
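The quantile statistics for stalk-color-above-ring can likewise be rebuilt from its frequency table. This sketch assumes linear interpolation at position q * (n - 1), which is what `statistics.quantiles` does with `method="inclusive"`; whether the report used exactly this rule is an assumption.

```python
import statistics

# Frequency table for stalk-color-above-ring, copied from the report (value -> count).
freq = {0: 432, 1: 36, 2: 96, 3: 576, 4: 448, 5: 192, 6: 1872, 7: 4485, 8: 8}

# Expand the table into the sorted list of observations.
data = sorted(v for v, c in freq.items() for _ in range(c))

# Quartiles, then the 5th and 95th percentiles as the first and last of 19 cut points.
q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
twentieths = statistics.quantiles(data, n=20, method="inclusive")
p5, p95 = twentieths[0], twentieths[-1]

iqr = q3 - q1
value_range = data[-1] - data[0]
# Median absolute deviation: the median of |x - median|.
mad = statistics.median(abs(x - median) for x in data)

print(p5, q1, median, q3, p95, iqr, value_range, mad)
```

The MAD comes out as 0 because more than half of the observations equal the median value 7.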
gill-color
Numeric
| Distinct count | 12 |
|---|---|
| Unique (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 4.811663597298956 |
|---|---|
| Minimum | 0 |
| Maximum | 11 |
| Zeros | 1728 |
| Zeros (%) | 21.2% |
| Memory size | 63.8 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 2 |
| median | 5 |
| Q3 | 7 |
| 95-th percentile | 10 |
| Maximum | 11 |
| Range | 11 |
| Interquartile range (IQR) | 5 |
Descriptive statistics
| Standard deviation | 3.537566635 |
|---|---|
| Coefficient of variation (CV) | 0.7352065587 |
| Kurtosis | -1.284274143 |
| Mean | 4.811663597 |
| Median Absolute Deviation (MAD) | 3 |
| Skewness | 0.06091970843 |
| Sum | 39191 |
| Variance | 12.5143777 |
Histogram with fixed size bins (bins=10)
Common values
| Value | Count | Frequency (%) | |
| 0 | 1728 | 21.2% | |
| 7 | 1500 | 18.4% | |
| 10 | 1203 | 14.8% | |
| 5 | 1052 | 12.9% | |
| 2 | 756 | 9.3% | |
| 3 | 733 | 9.0% | |
| 9 | 492 | 6.0% | |
| 4 | 411 | 5.0% | |
| 1 | 96 | 1.2% | |
| 11 | 86 | 1.1% | |
| 6 | 64 | 0.8% | |
| 8 | 24 | 0.3% |
Minimum values
| Value | Count | Frequency (%) | |
| 0 | 1728 | 21.2% | |
| 1 | 96 | 1.2% | |
| 2 | 756 | 9.3% | |
| 3 | 733 | 9.0% | |
| 4 | 411 | 5.0% | |
| 5 | 1052 | 12.9% | |
| 6 | 64 | 0.8% | |
| 7 | 1500 | 18.4% | |
| 8 | 24 | 0.3% | |
| 9 | 492 | 6.0% |
Maximum values
| Value | Count | Frequency (%) | |
| 11 | 86 | 1.1% | |
| 10 | 1203 | 14.8% | |
| 9 | 492 | 6.0% | |
| 8 | 24 | 0.3% | |
| 7 | 1500 | 18.4% | |
| 6 | 64 | 0.8% | |
| 5 | 1052 | 12.9% | |
| 4 | 411 | 5.0% | |
| 3 | 733 | 9.0% | |
| 2 | 756 | 9.3% |
cap-surface
Categorical
| Distinct count | 4 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 63.8 KiB |
Common values
| Value | Count | Frequency (%) | |
| 3 | 3251 | 39.9% | |
| 2 | 2562 | 31.5% | |
| 0 | 2328 | 28.6% | |
| 1 | 4 | < 0.1% |
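A frequency table like the one above can be produced with `collections.Counter`. The helper `frequency_table` is hypothetical, and the rule for printing "< 0.1%" (used when a non-zero share would otherwise round to 0.0%) is an assumption inferred from the table, not taken from the profiler's source.

```python
from collections import Counter

def frequency_table(values):
    """Return (value, count, percentage label) rows, most frequent first."""
    counts = Counter(values)
    n = len(values)
    rows = []
    for value, count in counts.most_common():
        pct = 100 * count / n
        # Assumption: shares that would round to 0.0% are shown as "< 0.1%".
        label = "< 0.1%" if 0 < pct < 0.05 else f"{pct:.1f}%"
        rows.append((value, count, label))
    return rows

# Rebuild the cap-surface column from the counts in the table above.
values = [3] * 3251 + [2] * 2562 + [0] * 2328 + [1] * 4
table = frequency_table(values)
for row in table:
    print(row)
```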
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Most occurring characters
| Value | Count | Frequency (%) | |
| 3 | 3251 | 39.9% | |
| 2 | 2562 | 31.5% | |
| 0 | 2328 | 28.6% | |
| 1 | 4 | < 0.1% |
Most occurring categories
| Value | Count | Frequency (%) | |
| Decimal Number | 8145 | 100.0% |
Most frequent Decimal Number characters
| Value | Count | Frequency (%) | |
| 3 | 3251 | 39.9% | |
| 2 | 2562 | 31.5% | |
| 0 | 2328 | 28.6% | |
| 1 | 4 | < 0.1% |
Most occurring scripts
| Value | Count | Frequency (%) | |
| Common | 8145 | 100.0% |
Most frequent Common characters
| Value | Count | Frequency (%) | |
| 3 | 3251 | 39.9% | |
| 2 | 2562 | 31.5% | |
| 0 | 2328 | 28.6% | |
| 1 | 4 | < 0.1% |
Most occurring blocks
| Value | Count | Frequency (%) | |
| ASCII | 8145 | 100.0% |
Most frequent ASCII characters
| Value | Count | Frequency (%) | |
| 3 | 3251 | 39.9% | |
| 2 | 2562 | 31.5% | |
| 0 | 2328 | 28.6% | |
| 1 | 4 | < 0.1% |
veil-type
Boolean
| Distinct count | 1 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 63.8 KiB |
Common values
| Value | Count | Frequency (%) | |
| 0 | 8145 | 100.0% |
gill-attachment
Boolean
| Distinct count | 2 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 63.8 KiB |
Common values
| Value | Count | Frequency (%) | |
| 1 | 7935 | 97.4% | |
| 0 | 210 | 2.6% |
population
Numeric
| Distinct count | 6 |
|---|---|
| Unique (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 3.642971147943524 |
|---|---|
| Minimum | 0 |
| Maximum | 5 |
| Zeros | 385 |
| Zeros (%) | 4.7% |
| Memory size | 63.8 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 1 |
| Q1 | 3 |
| median | 4 |
| Q3 | 4 |
| 95-th percentile | 5 |
| Maximum | 5 |
| Range | 5 |
| Interquartile range (IQR) | 1 |
Descriptive statistics
| Standard deviation | 1.25170611 |
|---|---|
| Coefficient of variation (CV) | 0.3435948458 |
| Kurtosis | 1.675005546 |
| Mean | 3.642971148 |
| Median Absolute Deviation (MAD) | 1 |
| Skewness | -1.411394602 |
| Sum | 29672 |
| Variance | 1.566768185 |
Histogram with fixed size bins (bins=10)
Common values
| Value | Count | Frequency (%) | |
| 4 | 4045 | 49.7% | |
| 5 | 1714 | 21.0% | |
| 3 | 1260 | 15.5% | |
| 2 | 401 | 4.9% | |
| 0 | 385 | 4.7% | |
| 1 | 340 | 4.2% |
Minimum values
| Value | Count | Frequency (%) | |
| 0 | 385 | 4.7% | |
| 1 | 340 | 4.2% | |
| 2 | 401 | 4.9% | |
| 3 | 1260 | 15.5% | |
| 4 | 4045 | 49.7% | |
| 5 | 1714 | 21.0% |
Maximum values
| Value | Count | Frequency (%) | |
| 5 | 1714 | 21.0% | |
| 4 | 4045 | 49.7% | |
| 3 | 1260 | 15.5% | |
| 2 | 401 | 4.9% | |
| 1 | 340 | 4.2% | |
| 0 | 385 | 4.7% |
stalk-surface-above-ring
Categorical
| Distinct count | 4 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 63.8 KiB |
Common values
| Value | Count | Frequency (%) | |
| 2 | 5195 | 63.8% | |
| 1 | 2372 | 29.1% | |
| 0 | 554 | 6.8% | |
| 3 | 24 | 0.3% |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Most occurring characters
| Value | Count | Frequency (%) | |
| 2 | 5195 | 63.8% | |
| 1 | 2372 | 29.1% | |
| 0 | 554 | 6.8% | |
| 3 | 24 | 0.3% |
Most occurring categories
| Value | Count | Frequency (%) | |
| Decimal Number | 8145 | 100.0% |
Most frequent Decimal Number characters
| Value | Count | Frequency (%) | |
| 2 | 5195 | 63.8% | |
| 1 | 2372 | 29.1% | |
| 0 | 554 | 6.8% | |
| 3 | 24 | 0.3% |
Most occurring scripts
| Value | Count | Frequency (%) | |
| Common | 8145 | 100.0% |
Most frequent Common characters
| Value | Count | Frequency (%) | |
| 2 | 5195 | 63.8% | |
| 1 | 2372 | 29.1% | |
| 0 | 554 | 6.8% | |
| 3 | 24 | 0.3% |
Most occurring blocks
| Value | Count | Frequency (%) | |
| ASCII | 8145 | 100.0% |
Most frequent ASCII characters
| Value | Count | Frequency (%) | |
| 2 | 5195 | 63.8% | |
| 1 | 2372 | 29.1% | |
| 0 | 554 | 6.8% | |
| 3 | 24 | 0.3% |
bruises?
Boolean
| Distinct count | 2 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 63.8 KiB |
Common values
| Value | Count | Frequency (%) | |
| 0 | 4755 | 58.4% | |
| 1 | 3390 | 41.6% |
odor
Numeric
| Distinct count | 9 |
|---|---|
| Unique (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 4.141313689379988 |
|---|---|
| Minimum | 0 |
| Maximum | 8 |
| Zeros | 407 |
| Zeros (%) | 5.0% |
| Memory size | 63.8 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 1 |
| Q1 | 2 |
| median | 5 |
| Q3 | 5 |
| 95-th percentile | 8 |
| Maximum | 8 |
| Range | 8 |
| Interquartile range (IQR) | 3 |
Descriptive statistics
| Standard deviation | 2.105002343 |
|---|---|
| Coefficient of variation (CV) | 0.5082933825 |
| Kurtosis | -0.775168329 |
| Mean | 4.141313689 |
| Median Absolute Deviation (MAD) | 2 |
| Skewness | -0.08206752281 |
| Sum | 33731 |
| Variance | 4.431034865 |
Histogram with fixed size bins (bins=10)
Common values
| Value | Count | Frequency (%) | |
| 5 | 3535 | 43.4% | |
| 2 | 2160 | 26.5% | |
| 7 | 576 | 7.1% | |
| 8 | 576 | 7.1% | |
| 0 | 407 | 5.0% | |
| 3 | 406 | 5.0% | |
| 6 | 257 | 3.2% | |
| 1 | 192 | 2.4% | |
| 4 | 36 | 0.4% |
Minimum values
| Value | Count | Frequency (%) | |
| 0 | 407 | 5.0% | |
| 1 | 192 | 2.4% | |
| 2 | 2160 | 26.5% | |
| 3 | 406 | 5.0% | |
| 4 | 36 | 0.4% | |
| 5 | 3535 | 43.4% | |
| 6 | 257 | 3.2% | |
| 7 | 576 | 7.1% | |
| 8 | 576 | 7.1% |
Maximum values
| Value | Count | Frequency (%) | |
| 8 | 576 | 7.1% | |
| 7 | 576 | 7.1% | |
| 6 | 257 | 3.2% | |
| 5 | 3535 | 43.4% | |
| 4 | 36 | 0.4% | |
| 3 | 406 | 5.0% | |
| 2 | 2160 | 26.5% | |
| 1 | 192 | 2.4% | |
| 0 | 407 | 5.0% |
ring-number
Categorical
| Distinct count | 3 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 63.8 KiB |
Common values
| Value | Count | Frequency (%) | |
| 1 | 7509 | 92.2% | |
| 2 | 600 | 7.4% | |
| 0 | 36 | 0.4% |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Most occurring characters
| Value | Count | Frequency (%) | |
| 1 | 7509 | 92.2% | |
| 2 | 600 | 7.4% | |
| 0 | 36 | 0.4% |
Most occurring categories
| Value | Count | Frequency (%) | |
| Decimal Number | 8145 | 100.0% |
Most frequent Decimal Number characters
| Value | Count | Frequency (%) | |
| 1 | 7509 | 92.2% | |
| 2 | 600 | 7.4% | |
| 0 | 36 | 0.4% |
Most occurring scripts
| Value | Count | Frequency (%) | |
| Common | 8145 | 100.0% |
Most frequent Common characters
| Value | Count | Frequency (%) | |
| 1 | 7509 | 92.2% | |
| 2 | 600 | 7.4% | |
| 0 | 36 | 0.4% |
Most occurring blocks
| Value | Count | Frequency (%) | |
| ASCII | 8145 | 100.0% |
Most frequent ASCII characters
| Value | Count | Frequency (%) | |
| 1 | 7509 | 92.2% | |
| 2 | 600 | 7.4% | |
| 0 | 36 | 0.4% |
stalk-surface-below-ring
Categorical
| Distinct count | 4 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 63.8 KiB |
Common values
| Value | Count | Frequency (%) | |
| 2 | 4953 | 60.8% | |
| 1 | 2304 | 28.3% | |
| 0 | 601 | 7.4% | |
| 3 | 287 | 3.5% |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Most occurring characters
| Value | Count | Frequency (%) | |
| 2 | 4953 | 60.8% | |
| 1 | 2304 | 28.3% | |
| 0 | 601 | 7.4% | |
| 3 | 287 | 3.5% |
Most occurring categories
| Value | Count | Frequency (%) | |
| Decimal Number | 8145 | 100.0% |
Most frequent Decimal Number characters
| Value | Count | Frequency (%) | |
| 2 | 4953 | 60.8% | |
| 1 | 2304 | 28.3% | |
| 0 | 601 | 7.4% | |
| 3 | 287 | 3.5% |
Most occurring scripts
| Value | Count | Frequency (%) | |
| Common | 8145 | 100.0% |
Most frequent Common characters
| Value | Count | Frequency (%) | |
| 2 | 4953 | 60.8% | |
| 1 | 2304 | 28.3% | |
| 0 | 601 | 7.4% | |
| 3 | 287 | 3.5% |
Most occurring blocks
| Value | Count | Frequency (%) | |
| ASCII | 8145 | 100.0% |
Most frequent ASCII characters
| Value | Count | Frequency (%) | |
| 2 | 4953 | 60.8% | |
| 1 | 2304 | 28.3% | |
| 0 | 601 | 7.4% | |
| 3 | 287 | 3.5% |
ring-type
Numeric
| Distinct count | 5 |
|---|---|
| Unique (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 2.294413750767342 |
|---|---|
| Minimum | 0 |
| Maximum | 4 |
| Zeros | 2780 |
| Zeros (%) | 34.1% |
| Memory size | 63.8 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 2 |
| Q3 | 4 |
| 95-th percentile | 4 |
| Maximum | 4 |
| Range | 4 |
| Interquartile range (IQR) | 4 |
Descriptive statistics
| Standard deviation | 1.801753533 |
|---|---|
| Coefficient of variation (CV) | 0.7852783886 |
| Kurtosis | -1.707784602 |
| Mean | 2.294413751 |
| Median Absolute Deviation (MAD) | 2 |
| Skewness | -0.2925260816 |
| Sum | 18688 |
| Variance | 3.246315794 |
Histogram with fixed size bins (bins=10)
Common values
| Value | Count | Frequency (%) | |
| 4 | 3985 | 48.9% | |
| 0 | 2780 | 34.1% | |
| 2 | 1296 | 15.9% | |
| 1 | 48 | 0.6% | |
| 3 | 36 | 0.4% |
Minimum values
| Value | Count | Frequency (%) | |
| 0 | 2780 | 34.1% | |
| 1 | 48 | 0.6% | |
| 2 | 1296 | 15.9% | |
| 3 | 36 | 0.4% | |
| 4 | 3985 | 48.9% |
Maximum values
| Value | Count | Frequency (%) | |
| 4 | 3985 | 48.9% | |
| 3 | 36 | 0.4% | |
| 2 | 1296 | 15.9% | |
| 1 | 48 | 0.6% | |
| 0 | 2780 | 34.1% |
veil-color
Categorical
| Distinct count | 4 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 63.8 KiB |
Common values
| Value | Count | Frequency (%) | |
| 2 | 7945 | 97.5% | |
| 1 | 96 | 1.2% | |
| 0 | 96 | 1.2% | |
| 3 | 8 | 0.1% |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Most occurring characters
| Value | Count | Frequency (%) | |
| 2 | 7945 | 97.5% | |
| 0 | 96 | 1.2% | |
| 1 | 96 | 1.2% | |
| 3 | 8 | 0.1% |
Most occurring categories
| Value | Count | Frequency (%) | |
| Decimal Number | 8145 | 100.0% |
Most frequent Decimal Number characters
| Value | Count | Frequency (%) | |
| 2 | 7945 | 97.5% | |
| 0 | 96 | 1.2% | |
| 1 | 96 | 1.2% | |
| 3 | 8 | 0.1% |
Most occurring scripts
| Value | Count | Frequency (%) | |
| Common | 8145 | 100.0% |
Most frequent Common characters
| Value | Count | Frequency (%) | |
| 2 | 7945 | 97.5% | |
| 0 | 96 | 1.2% | |
| 1 | 96 | 1.2% | |
| 3 | 8 | 0.1% |
Most occurring blocks
| Value | Count | Frequency (%) | |
| ASCII | 8145 | 100.0% |
Most frequent ASCII characters
| Value | Count | Frequency (%) | |
| 2 | 7945 | 97.5% | |
| 0 | 96 | 1.2% | |
| 1 | 96 | 1.2% | |
| 3 | 8 | 0.1% |
cap-color
Numeric
| Distinct count | 10 |
|---|---|
| Unique (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 4.511479435236342 |
|---|---|
| Minimum | 0 |
| Maximum | 9 |
| Zeros | 168 |
| Zeros (%) | 2.1% |
| Memory size | 63.8 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 2 |
| Q1 | 3 |
| median | 4 |
| Q3 | 8 |
| 95-th percentile | 9 |
| Maximum | 9 |
| Range | 9 |
| Interquartile range (IQR) | 5 |
Descriptive statistics
| Standard deviation | 2.548863741 |
|---|---|
| Coefficient of variation (CV) | 0.5649729268 |
| Kurtosis | -0.846329753 |
| Mean | 4.511479435 |
| Median Absolute Deviation (MAD) | 1 |
| Skewness | 0.7024968851 |
| Sum | 36746 |
| Variance | 6.496706369 |
Histogram with fixed size bins (bins=10)
Common values
| Value | Count | Frequency (%) | |
| 4 | 2287 | 28.1% | |
| 3 | 1843 | 22.6% | |
| 2 | 1500 | 18.4% | |
| 9 | 1081 | 13.3% | |
| 8 | 1046 | 12.8% | |
| 0 | 168 | 2.1% | |
| 5 | 144 | 1.8% | |
| 1 | 44 | 0.5% | |
| 7 | 16 | 0.2% | |
| 6 | 16 | 0.2% |
Minimum values
| Value | Count | Frequency (%) | |
| 0 | 168 | 2.1% | |
| 1 | 44 | 0.5% | |
| 2 | 1500 | 18.4% | |
| 3 | 1843 | 22.6% | |
| 4 | 2287 | 28.1% | |
| 5 | 144 | 1.8% | |
| 6 | 16 | 0.2% | |
| 7 | 16 | 0.2% | |
| 8 | 1046 | 12.8% | |
| 9 | 1081 | 13.3% |
Maximum values
| Value | Count | Frequency (%) | |
| 9 | 1081 | 13.3% | |
| 8 | 1046 | 12.8% | |
| 7 | 16 | 0.2% | |
| 6 | 16 | 0.2% | |
| 5 | 144 | 1.8% | |
| 4 | 2287 | 28.1% | |
| 3 | 1843 | 22.6% | |
| 2 | 1500 | 18.4% | |
| 1 | 44 | 0.5% | |
| 0 | 168 | 2.1% |
stalk-shape
Boolean
| Distinct count | 2 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 63.8 KiB |
Common values
| Value | Count | Frequency (%) | |
| 1 | 4615 | 56.7% | |
| 0 | 3530 | 43.3% |
habitat
Numeric
| Distinct count | 7 |
|---|---|
| Unique (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1.510620012277471 |
|---|---|
| Minimum | 0 |
| Maximum | 6 |
| Zeros | 3151 |
| Zeros (%) | 38.7% |
| Memory size | 63.8 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 1 |
| Q3 | 2 |
| 95-th percentile | 5 |
| Maximum | 6 |
| Range | 6 |
| Interquartile range (IQR) | 2 |
Descriptive statistics
| Standard deviation | 1.72028988 |
|---|---|
| Coefficient of variation (CV) | 1.138797226 |
| Kurtosis | -0.2641483349 |
| Mean | 1.510620012 |
| Median Absolute Deviation (MAD) | 1 |
| Skewness | 0.9830076938 |
| Sum | 12304 |
| Variance | 2.95939727 |
Histogram with fixed size bins (bins=10)
Common values
| Value | Count | Frequency (%) | |
| 0 | 3151 | 38.7% | |
| 1 | 2155 | 26.5% | |
| 4 | 1146 | 14.1% | |
| 2 | 832 | 10.2% | |
| 5 | 371 | 4.6% | |
| 3 | 298 | 3.7% | |
| 6 | 192 | 2.4% |
Minimum values
| Value | Count | Frequency (%) | |
| 0 | 3151 | 38.7% | |
| 1 | 2155 | 26.5% | |
| 2 | 832 | 10.2% | |
| 3 | 298 | 3.7% | |
| 4 | 1146 | 14.1% | |
| 5 | 371 | 4.6% | |
| 6 | 192 | 2.4% |
Maximum values
| Value | Count | Frequency (%) | |
| 6 | 192 | 2.4% | |
| 5 | 371 | 4.6% | |
| 4 | 1146 | 14.1% | |
| 3 | 298 | 3.7% | |
| 2 | 832 | 10.2% | |
| 1 | 2155 | 26.5% | |
| 0 | 3151 | 38.7% |
gill-size
Boolean
| Distinct count | 2 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 63.8 KiB |
Common values
| Value | Count | Frequency (%) | |
| 0 | 5626 | 69.1% | |
| 1 | 2519 | 30.9% |
stalk-root
Numeric
| Distinct count | 5 |
|---|---|
| Unique (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1.1134438305709025 |
|---|---|
| Minimum | 0 |
| Maximum | 4 |
| Zeros | 2480 |
| Zeros (%) | 30.4% |
| Memory size | 63.8 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 1 |
| Q3 | 1 |
| 95-th percentile | 3 |
| Maximum | 4 |
| Range | 4 |
| Interquartile range (IQR) | 1 |
Descriptive statistics
| Standard deviation | 1.063156529 |
|---|---|
| Coefficient of variation (CV) | 0.9548362475 |
| Kurtosis | 0.07543954876 |
| Mean | 1.113443831 |
| Median Absolute Deviation (MAD) | 1 |
| Skewness | 0.9430869608 |
| Sum | 9069 |
| Variance | 1.130301805 |
Histogram with fixed size bins (bins=10)
Common values
| Value | Count | Frequency (%) | |
| 1 | 3779 | 46.4% | |
| 0 | 2480 | 30.4% | |
| 3 | 1128 | 13.8% | |
| 2 | 563 | 6.9% | |
| 4 | 195 | 2.4% |
Minimum values
| Value | Count | Frequency (%) | |
| 0 | 2480 | 30.4% | |
| 1 | 3779 | 46.4% | |
| 2 | 563 | 6.9% | |
| 3 | 1128 | 13.8% | |
| 4 | 195 | 2.4% |
Maximum values
| Value | Count | Frequency (%) | |
| 4 | 195 | 2.4% | |
| 3 | 1128 | 13.8% | |
| 2 | 563 | 6.9% | |
| 1 | 3779 | 46.4% | |
| 0 | 2480 | 30.4% |
target
Boolean
| Distinct count | 2 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 63.8 KiB |
Common values
| Value | Count | Frequency (%) | |
| 0 | 4228 | 51.9% | |
| 1 | 3917 | 48.1% |
Pearson's r
Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and (positive) scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r. To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
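As a minimal sketch of that calculation, the function below divides the covariance by the product of the standard deviations (the common 1/n factors cancel, so plain sums are used). The name `pearson_r` and the sample data are illustrative only.

```python
import math

def pearson_r(x, y):
    """Pearson correlation: covariance divided by the product of standard deviations."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

# Illustrative data, not taken from the report.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
r = pearson_r(x, y)
# Shifting and rescaling x by a positive factor leaves r unchanged.
r_rescaled = pearson_r([10 * a + 3 for a in x], y)
print(r, r_rescaled)
```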
Spearman's ρ
Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation and +1 indicating total positive monotonic correlation. To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
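The rank-based calculation can be sketched as follows: replace each variable by its ranks, then apply the Pearson formula to the ranks. Giving tied values the mean of their positions is a common convention and is assumed here; the function names are illustrative.

```python
import math

def average_ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def pearson_r(x, y):
    """Covariance divided by the product of standard deviations."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / math.sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))

def spearman_rho(x, y):
    """Spearman correlation: Pearson correlation of the rank variables."""
    return pearson_r(average_ranks(x), average_ranks(y))

# A monotonic but nonlinear relationship (y = x cubed).
x = [1, 2, 3, 4, 5]
y = [1, 8, 27, 64, 125]
print(spearman_rho(x, y), pearson_r(x, y))
```

On this example ρ is 1 while r is below 1, which illustrates the point about nonlinear monotonic relationships.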
Kendall's τ
Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation and +1 indicating total positive correlation. To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
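The pair-counting definition above corresponds to the variant usually called tau-a, sketched below. Statistical libraries often default to tau-b, which additionally corrects for ties, so results on tied data such as the encoded columns in this report may differ from this sketch. The function name is illustrative.

```python
from itertools import combinations

def kendall_tau_a(x, y):
    """(concordant - discordant) / total number of pairs; ties count as neither."""
    concordant = discordant = 0
    pairs = list(combinations(range(len(x)), 2))
    for i, j in pairs:
        sign = (x[i] - x[j]) * (y[i] - y[j])
        if sign > 0:
            concordant += 1      # both variables move in the same direction
        elif sign < 0:
            discordant += 1      # the variables move in opposite directions
    return (concordant - discordant) / len(pairs)

# Illustrative data: one swapped pair out of six.
tau = kendall_tau_a([1, 2, 3, 4], [1, 3, 2, 4])
print(tau)
```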
Phik (φk)
Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in case of a bivariate normal input distribution. Extensive documentation is available from the Phik project.
Cramér's V (φc)
Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been proved to be biased, even for large samples. We use a bias-corrected measure proposed by Bergsma in 2013.
First rows
| | cap-shape | stalk-color-above-ring | gill-color | cap-surface | veil-type | gill-attachment | population | stalk-surface-above-ring | bruises? | odor | ring-number | stalk-surface-below-ring | ring-type | veil-color | cap-color | stalk-shape | habitat | gill-size | stalk-root | target |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 5 | 7 | 4 | 2 | 0 | 1 | 3 | 2 | 1 | 6 | 1 | 2 | 4 | 2 | 4 | 0 | 5 | 1 | 3 | 1 |
| 1 | 5 | 7 | 4 | 2 | 0 | 1 | 2 | 2 | 1 | 0 | 1 | 2 | 4 | 2 | 9 | 0 | 1 | 0 | 2 | 0 |
| 2 | 0 | 7 | 5 | 2 | 0 | 1 | 2 | 2 | 1 | 3 | 1 | 2 | 4 | 2 | 8 | 0 | 3 | 0 | 2 | 0 |
| 3 | 5 | 7 | 5 | 3 | 0 | 1 | 3 | 2 | 1 | 6 | 1 | 2 | 4 | 2 | 8 | 0 | 5 | 1 | 3 | 1 |
| 4 | 5 | 7 | 4 | 2 | 0 | 1 | 0 | 2 | 0 | 5 | 1 | 2 | 0 | 2 | 3 | 1 | 1 | 0 | 3 | 0 |
| 5 | 5 | 7 | 5 | 3 | 0 | 1 | 2 | 2 | 1 | 0 | 1 | 2 | 4 | 2 | 9 | 0 | 1 | 0 | 2 | 0 |
| 6 | 0 | 7 | 2 | 2 | 0 | 1 | 2 | 2 | 1 | 0 | 1 | 2 | 4 | 2 | 8 | 0 | 3 | 0 | 2 | 0 |
| 7 | 0 | 7 | 5 | 3 | 0 | 1 | 3 | 2 | 1 | 3 | 1 | 2 | 4 | 2 | 8 | 0 | 3 | 0 | 2 | 0 |
| 8 | 5 | 7 | 7 | 3 | 0 | 1 | 4 | 2 | 1 | 6 | 1 | 2 | 4 | 2 | 8 | 0 | 1 | 1 | 3 | 1 |
| 9 | 0 | 7 | 2 | 2 | 0 | 1 | 3 | 2 | 1 | 0 | 1 | 2 | 4 | 2 | 9 | 0 | 3 | 0 | 2 | 0 |
Last rows
| | cap-shape | stalk-color-above-ring | gill-color | cap-surface | veil-type | gill-attachment | population | stalk-surface-above-ring | bruises? | odor | ring-number | stalk-surface-below-ring | ring-type | veil-color | cap-color | stalk-shape | habitat | gill-size | stalk-root | target |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 8135 | 0 | 7 | 2 | 3 | 0 | 1 | 3 | 2 | 1 | 3 | 1 | 2 | 4 | 2 | 9 | 0 | 3 | 0 | 2 | 0 |
| 8136 | 2 | 7 | 2 | 0 | 0 | 1 | 4 | 2 | 0 | 5 | 1 | 2 | 4 | 2 | 3 | 0 | 5 | 1 | 3 | 0 |
| 8137 | 5 | 7 | 7 | 0 | 0 | 1 | 4 | 2 | 1 | 3 | 1 | 2 | 4 | 2 | 9 | 1 | 0 | 1 | 1 | 0 |
| 8138 | 5 | 7 | 2 | 2 | 0 | 1 | 2 | 2 | 1 | 0 | 1 | 2 | 4 | 2 | 9 | 0 | 3 | 0 | 2 | 0 |
| 8139 | 2 | 7 | 5 | 0 | 0 | 1 | 5 | 2 | 0 | 5 | 1 | 2 | 4 | 2 | 4 | 0 | 5 | 1 | 3 | 0 |
| 8140 | 5 | 7 | 3 | 0 | 0 | 1 | 3 | 0 | 0 | 5 | 1 | 2 | 0 | 2 | 3 | 1 | 1 | 0 | 3 | 0 |
| 8141 | 5 | 7 | 7 | 3 | 0 | 1 | 3 | 2 | 1 | 3 | 1 | 3 | 4 | 2 | 9 | 0 | 4 | 0 | 4 | 0 |
| 8142 | 2 | 7 | 7 | 2 | 0 | 1 | 4 | 2 | 1 | 3 | 1 | 2 | 4 | 2 | 8 | 1 | 0 | 1 | 1 | 0 |
| 8143 | 5 | 7 | 4 | 2 | 0 | 1 | 3 | 2 | 1 | 3 | 1 | 2 | 4 | 2 | 9 | 0 | 1 | 0 | 2 | 0 |
| 8144 | 2 | 7 | 7 | 3 | 0 | 1 | 5 | 2 | 1 | 0 | 1 | 3 | 4 | 2 | 9 | 0 | 4 | 0 | 4 | 0 |
Most frequent
| | cap-shape | stalk-color-above-ring | gill-color | cap-surface | veil-type | gill-attachment | population | stalk-surface-above-ring | bruises? | odor | ring-number | stalk-surface-below-ring | ring-type | veil-color | cap-color | stalk-shape | habitat | gill-size | stalk-root | target | count |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 232 | 2 | 3 | 5 | 0 | 0 | 1 | 4 | 2 | 1 | 5 | 1 | 2 | 4 | 2 | 2 | 1 | 0 | 0 | 1 | 0 | 6 |
| 233 | 2 | 3 | 5 | 0 | 0 | 1 | 4 | 2 | 1 | 5 | 1 | 2 | 4 | 2 | 3 | 1 | 0 | 0 | 1 | 0 | 6 |
| 234 | 2 | 3 | 5 | 0 | 0 | 1 | 4 | 2 | 1 | 5 | 1 | 2 | 4 | 2 | 4 | 1 | 0 | 0 | 1 | 0 | 6 |
| 235 | 2 | 3 | 5 | 0 | 0 | 1 | 5 | 2 | 1 | 5 | 1 | 2 | 4 | 2 | 2 | 1 | 0 | 0 | 1 | 0 | 6 |
| 236 | 2 | 3 | 5 | 0 | 0 | 1 | 5 | 2 | 1 | 5 | 1 | 2 | 4 | 2 | 3 | 1 | 0 | 0 | 1 | 0 | 6 |
| 237 | 2 | 3 | 5 | 0 | 0 | 1 | 5 | 2 | 1 | 5 | 1 | 2 | 4 | 2 | 4 | 1 | 0 | 0 | 1 | 0 | 6 |
| 238 | 2 | 3 | 5 | 3 | 0 | 1 | 4 | 2 | 1 | 5 | 1 | 2 | 4 | 2 | 2 | 1 | 0 | 0 | 1 | 0 | 6 |
| 239 | 2 | 3 | 5 | 3 | 0 | 1 | 4 | 2 | 1 | 5 | 1 | 2 | 4 | 2 | 3 | 1 | 0 | 0 | 1 | 0 | 6 |
| 240 | 2 | 3 | 5 | 3 | 0 | 1 | 4 | 2 | 1 | 5 | 1 | 2 | 4 | 2 | 4 | 1 | 0 | 0 | 1 | 0 | 6 |
| 241 | 2 | 3 | 5 | 3 | 0 | 1 | 5 | 2 | 1 | 5 | 1 | 2 | 4 | 2 | 2 | 1 | 0 | 0 | 1 | 0 | 6 |
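A listing of the most frequent duplicated rows, like the table above, can be sketched with `collections.Counter`. The helper name `most_frequent_duplicates` and the three-column toy rows are hypothetical; the real table has 20 columns plus the count.

```python
from collections import Counter

def most_frequent_duplicates(rows, top=10):
    """Return (row, count) for rows occurring more than once, most frequent first."""
    counts = Counter(tuple(row) for row in rows)
    duplicated = [(row, n) for row, n in counts.items() if n > 1]
    # Stable sort: rows with equal counts keep their first-seen order.
    duplicated.sort(key=lambda item: item[1], reverse=True)
    return duplicated[:top]

# Toy data, not taken from the report.
rows = [(2, 3, 5), (5, 7, 4), (2, 3, 5), (0, 7, 2), (5, 7, 4), (2, 3, 5)]
result = most_frequent_duplicates(rows)
for row, count in result:
    print(row, count)
```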